Bug 3392951 - NASM encoding choice fingerprint changed for xchg reg,reg
Status: CLOSED FIXED
Alias: None
Product: NASM
Classification: Unclassified
Component: Assembler
Version: 3.00.xx
Hardware: All All
Importance: Medium annoyance
Assignee: nobody
Reported: 2025-09-04 05:07 PDT by E. C. Masloch
Modified: 2025-09-19 14:53 PDT

Obtained from: Built from git using configure
Generated by: Human
Bug category: Other, Unexpected or confusing behavior
Observed for: Production code
Regression: Yes (specify version below)
Regression since:
https://github.com/netwide-assembler/nasm/commit/4d9c102e44457738dd2ddfda89ef1a20c1729783


Attachments
Patch to undo the encoding choices difference (869 bytes, text/plain)
2025-09-04 05:07 PDT, E. C. Masloch

Description E. C. Masloch 2025-09-04 05:07:18 PDT
Created attachment 411944
Patch to undo the encoding choices difference

Comparing files generated by current git NASM to those generated by older versions finds differences that do not change the semantic meaning or length of the code (as determined by my ident86 tool). Apparently, xchg reg,reg instructions (with neither operand ax/eax) are encoded with the order of operands swapped.

I touched on this in https://bugzilla.nasm.us/show_bug.cgi?id=3392950

That thread has a small test case, too:

test$ cat test.asm

        xchg bx, dx
        xchg cx, dx
test$ nasm test.asm -l /dev/stderr
     1
     2 00000000 87DA                            xchg bx, dx
     3 00000002 87CA                            xchg cx, dx
test$ ndisasm test
00000000  87DA              xchg bx,dx
00000002  87CA              xchg cx,dx
test$ ~/proj/nasmtest/patch/nasm test.asm -l /dev/stderr
     1
     2 00000000 87D3                            xchg bx, dx
     3 00000002 87D1                            xchg cx, dx
test$ ~/proj/nasmtest/patch/ndisasm test
00000000  87D3              xchg bx,dx
00000002  87D1              xchg cx,dx
test$ ndisasm test
00000000  87D3              xchg dx,bx
00000002  87D1              xchg dx,cx
test$
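
As an illustration of the byte pairs in the transcript above, here is a minimal Python sketch that splits the ModRM byte following opcode 87 into its reg and r/m fields. The helper names are hypothetical and not part of NASM or ident86; the sketch only handles the register-direct case (mod = 11) seen in the listings.

```python
# 16-bit register names indexed by their 3-bit x86 register number.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def decode_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

def describe_xchg16(code):
    """Name the registers held in the reg and r/m fields of an 87 /r xchg."""
    opcode, modrm = code
    if opcode != 0x87:
        raise ValueError("only opcode 87 is handled in this sketch")
    mod, reg, rm = decode_modrm(modrm)
    if mod != 0b11:
        raise ValueError("only register-direct ModRM (mod=11) is handled")
    return {"reg": REG16[reg], "rm": REG16[rm]}

# The four encodings that appear in the listings.
for hexstr in ("87DA", "87D3", "87CA", "87D1"):
    print(hexstr, describe_xchg16(bytes.fromhex(hexstr)))
```

Under this decoding, 87 DA and 87 D3 name the same two registers (bx and dx) with the reg and r/m fields swapped, and likewise 87 CA and 87 D1 for cx and dx.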

This difference also affects lDebug's line assembler and disassembler, because on 2024-04-23 I changed them to match the old NASM interpretation: https://hg.pushbx.org/ecm/ldebug/rev/ee8d326028d6

git bisect identifies the first bad commit as this one from 2023-10-12: https://github.com/netwide-assembler/nasm/commit/e993b75aa6dc59fbd083ddd7c2b97751fd7bacfc

With this patch the differences appear to be gone, without breaking anything:

diff --git a/x86/insns.dat b/x86/insns.dat
index a778265c..a3e74475 100644
--- a/x86/insns.dat
+++ b/x86/insns.dat
@@ -1456,10 +1456,10 @@ XCHG		reg64,reg_rax			[r-:	o64 90+r]				X86_64,LONG
 ; This must be NOLONG since opcode 90 is NOP, and in 64-bit mode
 ; "xchg eax,eax" is *not* a NOP.
 XCHG		reg_eax,reg_eax			[--:	o32 90]					386,NOLONG
-XCHG		reg8,reg8			[mr:	86 /r]					8086
-XCHG		reg16,reg16			[mr:	o16 87 /r]				8086
-XCHG		reg32,reg32			[mr:	o32 87 /r]				386
-XCHG		reg64,reg64			[mr:	o64 87 /r]				X86_64,LONG
+XCHG		reg8,reg8			[rm:	86 /r]					8086
+XCHG		reg16,reg16			[rm:	o16 87 /r]				8086
+XCHG		reg32,reg32			[rm:	o32 87 /r]				386
+XCHG		reg64,reg64			[rm:	o64 87 /r]				X86_64,LONG
 XCHG		mem,reg8			[mr:	hlenl 86 /r]				8086,SM,LOCK
 XCHG		mem,reg16			[mr:	hlenl o16 87 /r]			8086,SM,LOCK
 XCHG		mem,reg32			[mr:	hlenl o32 87 /r]			386,SM,LOCK
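
To make the effect of the mr/rm change in the patch concrete, the following Python sketch models the two operand-to-field mappings for 16-bit register operands. It is an assumption-laden illustration, not NASM's code: it assumes that "rm" places the first operand in the ModRM reg field and the second in the r/m field, that "mr" does the reverse, and that the code is assembled for 16-bit mode so o16 emits no prefix. The function name is hypothetical, and the accumulator short forms (90+r) are not modelled.

```python
# x86 register numbers for the 16-bit general-purpose registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def encode_xchg16(op1, op2, form):
    """Encode 'xchg op1, op2' as opcode 87 plus a register-direct ModRM byte.

    form "rm": first operand -> reg field, second operand -> r/m field.
    form "mr": first operand -> r/m field, second operand -> reg field.
    """
    if form == "rm":
        reg, rm = REG16[op1], REG16[op2]
    elif form == "mr":
        reg, rm = REG16[op2], REG16[op1]
    else:
        raise ValueError("form must be 'rm' or 'mr'")
    # mod = 11 (register-direct), then the 3-bit reg and r/m fields.
    return bytes([0x87, 0xC0 | (reg << 3) | rm])

for form in ("rm", "mr"):
    for ops in (("bx", "dx"), ("cx", "dx")):
        print(form, "xchg %s, %s" % ops, encode_xchg16(*ops, form).hex().upper())
```

With these assumptions, the "rm" mapping reproduces the 87DA/87CA bytes from the test case and the "mr" mapping reproduces 87D3/87D1.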

Comment 1 E. C. Masloch 2025-09-04 06:44:30 PDT
I tested building format.exe, share.exe, instsect.com and ldos.com with this patch applied and they're all completely binary identical.

The lDebug change is:

-xchg 90:3F _90:40 _L86:04 L86:85
+xchg 90:3F _90:40 L86:85 _L86:04

Underscore means it is only used by the assembler: https://hg.pushbx.org/ecm/ldebug/file/ee8d326028d6/source/instr.set#l43

The previous preferred and now dispreferred instruction key 04 is: https://hg.pushbx.org/ecm/ldebug/file/ee8d326028d6/source/instr.key#l30

04 OP_M_SRC_DST, OP_ALL+OP_RM, OP_ALL+OP_R	; add, adc, and, or, sub, ...

The newly preferred (both for assembler and disassembler) key 85 is:

85 OP_ALL+OP_R, OP_M_SRC_DST, OP_X, OPX_MOD_IF_REG_AX_EAX, OP_X, OPX_MOD_IF_OTHER_REG_AX_EAX, OP_ALL+OP_RM	; xchg

The key 85 is known to NASM as rm, whereas key 04 is mr.

Comment 2 H. Peter Anvin 2025-09-05 21:30:55 PDT
This is intentional: some CPU vendors have indicated that the "load" form (rm) under certain circumstances can perform better than the "store" (mr) forms.

Comment 3 E. C. Masloch 2025-09-06 00:39:53 PDT
(In reply to H. Peter Anvin from comment #2)
> This is intentional: some CPU vendors have indicated that the "load" form
> (rm) under certain circumstances can perform better than the "store" (mr)
> forms.

Look at the patch again. It deletes the mr forms in favour of rm, so unless your wording here is mistaken, "rm can perform better" is a reason to apply the patch.

Comment 4 E. C. Masloch 2025-09-06 00:43:24 PDT
Also, these are reg,reg cases, so xchg dx,cx and xchg cx,dx are surely equivalent? NASM will assemble one to 87 CA and the other to 87 D1, regardless of whether it prefers mr or rm (the preference only swaps which one is encoded as which). So I don't really understand how "rm can perform better" is true or relevant.
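
The equivalence claimed here can be sketched in Python by "executing" both encodings on a toy register file. This is only an illustrative model with hypothetical names; it covers the architectural effect of a register-direct 87 /r and says nothing about performance on real CPUs.

```python
# 16-bit register names indexed by their 3-bit x86 register number.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def run_xchg16(code, regs):
    """Return a copy of regs after a register-direct 'xchg' (opcode 87 /r)."""
    opcode, modrm = code
    if opcode != 0x87 or (modrm >> 6) != 0b11:
        raise ValueError("sketch handles only 87 /r with mod=11")
    a = REG16[(modrm >> 3) & 7]   # register named by the reg field
    b = REG16[modrm & 7]          # register named by the r/m field
    out = dict(regs)
    out[a], out[b] = regs[b], regs[a]
    return out

# Give every register a distinct value so any difference would be visible.
start = {name: i * 0x1111 for i, name in enumerate(REG16)}
after_ca = run_xchg16(bytes.fromhex("87CA"), start)
after_d1 = run_xchg16(bytes.fromhex("87D1"), start)
print(after_ca == after_d1)
```

In this model both byte sequences leave the register file in the same state, with cx and dx exchanged and every other register untouched.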

Comment 5 E. C. Masloch 2025-09-17 15:00:06 PDT
Hope this can still get picked up for the v3.00 release.

Comment 6 H. Peter Anvin 2025-09-19 13:01:53 PDT
This turned out to be a case of a significantly more serious problem, so thanks for bugging us about it.

Fix now checked in, will be in 3.00rc9.

Comment 7 E. C. Masloch 2025-09-19 14:49:54 PDT
Tested with a trivial test case; it seems fine. msbio.bin also seems to match exactly now. ldos.com doesn't build, but that is because of a new bug: https://bugzilla.nasm.us/show_bug.cgi?id=3392958

Thanks for the fix!

Comment 8 E. C. Masloch 2025-09-19 14:53:38 PDT
I accidentally reset the status to open; setting it back to CLOSED FIXED. Sorry.